fix(codegen): provenance view uses ROW_NUMBER for latest-per-entity #82

Merged
hyperpolymath merged 2 commits into main from fix/provenance-view-latest-per-entity on May 14, 2026

@hyperpolymath
Owner

Summary

Per V-L2-K1: the latest-per-entity subquery used a correlated MAX(timestamp) with an ambiguous outer reference. Replace it with ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY timestamp DESC), keeping only the rows numbered 1: no correlation, one row per entity by construction. Window functions are supported on PostgreSQL and on SQLite ≥ 3.25.

Closes

Test plan

  • cargo clippy --all-targets -- -D warnings clean
  • cargo test --lib --bins 38/38 (1 new: test_provenance_view_uses_window_function_for_latest_per_entity)
  • Test asserts both new pattern present AND old MAX(p2.timestamp) gone
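
For readers without the diff open, the shape of the new test can be sketched roughly as below. This is an illustration under assumptions, not the repository's code: the zero-argument signature, the view name `verisimdb_provenance_latest`, and the aliases `p` and `ranked` are hypothetical. Only `generate_provenance_view`, `verisimdb_provenance_log`, `entity_id`, `timestamp`, `_rn`, and the `MAX(p2.timestamp)` pattern come from this PR's text.

```rust
/// Hypothetical stand-in for the generator; the real signature and the
/// exact SQL it emits may differ.
fn generate_provenance_view() -> String {
    // Rank each entity's log rows newest-first, then keep only rank 1.
    // The view name and the aliases are assumptions made for illustration.
    [
        "CREATE VIEW verisimdb_provenance_latest AS",
        "SELECT * FROM (",
        "    SELECT p.*,",
        "           ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY timestamp DESC) AS _rn",
        "    FROM verisimdb_provenance_log AS p",
        ") AS ranked",
        "WHERE _rn = 1;",
    ]
    .join("\n")
}

fn main() {
    let sql = generate_provenance_view();
    // The new window-function pattern must be present...
    assert!(sql.contains("ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY timestamp DESC)"));
    // ...and the old correlated-subquery pattern must be gone.
    assert!(!sql.contains("MAX(p2.timestamp)"));
    println!("{sql}");
}
```

Asserting the absence of the old pattern as well as the presence of the new one guards against a regression where both code paths end up in the emitted view.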

hyperpolymath and others added 2 commits May 14, 2026 15:42
Closes #41.

`verisimdb_temporal_versions` had a non-unique partial index on
`(entity_id, table_name) WHERE valid_to IS NULL`. Two concurrent
writers could both successfully insert a row with `valid_to=NULL` for
the same entity, leaving two "current" versions and breaking the
point-in-time query semantics. There was also no constraint that
`valid_to` (when set) couldn't precede `valid_from`.

  - Promote the partial index to `CREATE UNIQUE INDEX` so the storage
    layer enforces "at most one current version per (entity, table)".
  - Add `CHECK (valid_to IS NULL OR valid_to >= valid_from)` so a
    version interval can't be backwards.

Test `test_temporal_table_has_unique_partial_index_and_valid_to_check`
asserts both clauses appear in the emitted DDL with temporal enabled.

`cargo clippy --all-targets -- -D warnings` clean; 37 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
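
The two constraints described in the commit above might be emitted along these lines. This is a minimal sketch, assuming a generator that returns the DDL as a string: the function name `generate_temporal_ddl`, the index name `idx_temporal_current`, and the column types are hypothetical. The table name, the four column names, and the two clauses are taken from the commit message.

```rust
/// Hypothetical generator for the temporal-versions DDL. The column types
/// and the index name are assumptions; the real schema has more columns.
fn generate_temporal_ddl() -> String {
    [
        "CREATE TABLE verisimdb_temporal_versions (",
        "    entity_id  TEXT NOT NULL,",
        "    table_name TEXT NOT NULL,",
        "    valid_from TIMESTAMP NOT NULL,",
        "    valid_to   TIMESTAMP,",
        // A closed interval may not end before it starts.
        "    CHECK (valid_to IS NULL OR valid_to >= valid_from)",
        ");",
        // UNIQUE plus the partial predicate: at most one open-ended
        // ("current") version per (entity_id, table_name).
        "CREATE UNIQUE INDEX idx_temporal_current",
        "    ON verisimdb_temporal_versions (entity_id, table_name)",
        "    WHERE valid_to IS NULL;",
    ]
    .join("\n")
}

fn main() {
    let ddl = generate_temporal_ddl();
    // Mirror the test's intent: both clauses appear in the emitted DDL.
    assert!(ddl.contains("CREATE UNIQUE INDEX"));
    assert!(ddl.contains("WHERE valid_to IS NULL"));
    assert!(ddl.contains("CHECK (valid_to IS NULL OR valid_to >= valid_from)"));
    println!("{ddl}");
}
```

With the unique partial index in place, the second of two concurrent inserts with `valid_to = NULL` for the same `(entity_id, table_name)` fails with a uniqueness violation instead of silently creating a second "current" row.
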
Closes #40.

`generate_provenance_view` emitted a correlated MAX(timestamp) subquery
whose outer reference `verisimdb_provenance_log.entity_id` was
ambiguous: the same table appears in multiple nested scopes, and
depending on planner choices the correlation either bound to the wrong
scope (returning all rows for the entity) or worked accidentally. The
view did not reliably return exactly one "latest" row per entity.

Replace with a window-function pattern:

    ROW_NUMBER() OVER (PARTITION BY entity_id ORDER BY timestamp DESC)

picked out as `_rn = 1`. No correlation, no scoping ambiguity, one row
per entity by construction. Supported on PostgreSQL (always) and SQLite
≥ 3.25.

Test `test_provenance_view_uses_window_function_for_latest_per_entity`
asserts the new pattern is present and the old `MAX(p2.timestamp)`
pattern is gone.

`cargo clippy --all-targets -- -D warnings` clean; 38 unit tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
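
As a reading aid, the "one row per entity by construction" property can be modelled in plain Rust. This is a sketch only: the `LogRow` struct, its `action` field, and the sample rows are hypothetical and do not reflect the real log schema. It mimics what filtering on `_rn = 1` computes. A `timestamp = MAX(timestamp)` filter, even when correctly correlated, would keep every row tied for the newest timestamp.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified provenance log row.
#[derive(Debug, Clone, Copy, PartialEq)]
struct LogRow {
    entity_id: &'static str,
    timestamp: i64,
    action: &'static str,
}

/// In-memory model of `ROW_NUMBER() OVER (PARTITION BY entity_id
/// ORDER BY timestamp DESC) = 1`: exactly one newest row per entity.
fn latest_per_entity(rows: &[LogRow]) -> HashMap<&'static str, LogRow> {
    let mut latest: HashMap<&'static str, LogRow> = HashMap::new();
    for row in rows {
        // Replace the stored row only if this one is strictly newer. On a
        // timestamp tie the first-seen row wins here; SQL leaves the
        // tie-break unspecified unless ORDER BY names further columns.
        let is_newer = latest
            .get(row.entity_id)
            .map_or(true, |current| row.timestamp > current.timestamp);
        if is_newer {
            latest.insert(row.entity_id, *row);
        }
    }
    latest
}

fn main() {
    let rows = [
        LogRow { entity_id: "a", timestamp: 1, action: "create" },
        LogRow { entity_id: "a", timestamp: 3, action: "update" },
        LogRow { entity_id: "b", timestamp: 2, action: "create" },
        LogRow { entity_id: "a", timestamp: 3, action: "tag" }, // tie with "update"
    ];
    let latest = latest_per_entity(&rows);
    // One row per entity, even though entity "a" has two rows at timestamp 3.
    assert_eq!(latest.len(), 2);
    assert_eq!(latest["a"].timestamp, 3);
    assert_eq!(latest["b"].action, "create");
    println!("{latest:?}");
}
```

If a deterministic winner on ties matters, one option is to add a second sort key to the `ORDER BY` in the window definition. Whether the real view needs that depends on how timestamps are generated.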
@hyperpolymath hyperpolymath merged commit b4d831f into main May 14, 2026
16 of 18 checks passed
@hyperpolymath hyperpolymath deleted the fix/provenance-view-latest-per-entity branch May 14, 2026 14:45
Development

Successfully merging this pull request may close these issues.

V-L2-K1: provenance latest-per-entity view has broken correlation